Reanalysis of "Hexamethylene amiloride synergizes with venetoclax to induce lysosome-dependent cell death in acute myeloid leukemia" by Jiang, X. et al., iScience, 2024
Jiang, X., Huang, K., Sun, X., Li, Y., Hua, L., Liu, F., Huang, R., Du, J., & Zeng, H. (2024). Hexamethylene amiloride synergizes with venetoclax to induce lysosome-dependent cell death in acute myeloid leukemia. iScience, 27(1), 108691.
Abstract
In this study, Jiang et al. [1] profiled acute myeloid leukemia (AML) cell lines under treatment with Hexamethylene amiloride (HA) and venetoclax, alone and in combination, to further our understanding of the therapeutic potential of targeting NHE1 in AML. The reanalysis of this dataset includes a comprehensive RNA-seq analysis pipeline consisting of UMAP (Uniform Manifold Approximation and Projection) [2], PCA (Principal Component Analysis) [3], and t-SNE (t-distributed Stochastic Neighbor Embedding) [4] plots for visualizing sample distributions. A clustergram heatmap provides an overview of gene expression patterns across samples. Differential gene expression analysis is performed for each control and perturbation sample pair. Enrichment analysis for each resulting gene signature is conducted using Enrichr [5, 6, 7]. Transcription factor analysis of these gene signatures is performed utilizing ChEA3 [8]. Furthermore, the reanalysis incorporates reverser and mimicker drug match analysis using L2S2 [9] and DRUG-seqr [10], considering both FDA-approved and non-FDA-approved compounds. Results are presented as tables and bar charts.
This abstract was generated with the assistance of Gemini 2.0 Flash.
Methods
RNA-seq alignment
Gene count matrices were obtained from ARCHS4 [11], which preprocessed the raw FASTQ data using the Kallisto [12] pseudoalignment algorithm and the STAR aligner.
Gene matrix processing
The raw gene matrix was filtered to remove genes with an average of fewer than 3 reads across samples. It was then quantile normalized, log2 transformed, and z-score normalized. A regex-based function processed the metadata associated with each sample to infer whether the sample belongs to a “control” or a “perturbation” group.
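The processing steps above might be sketched as follows. This is a minimal standard-library illustration and not the pipeline's actual code: the function names, the `log2(x + 1)` pseudocount, the choice to z-score each gene across samples, and the control keywords in the regex are all assumptions. The toy counts are the first four sample columns of Table 1.

```python
import math
import re
import statistics

def filter_low_counts(counts, min_mean=3):
    """Drop genes whose mean read count across samples is below min_mean."""
    return {g: v for g, v in counts.items() if statistics.mean(v) >= min_mean}

def quantile_normalize(matrix):
    """Force every sample (column) onto the same value distribution."""
    genes = list(matrix)
    n_samples = len(matrix[genes[0]])
    columns = [[matrix[g][j] for g in genes] for j in range(n_samples)]
    # Reference distribution: mean of the k-th smallest value over all samples.
    reference = [statistics.mean(vals) for vals in zip(*map(sorted, columns))]
    out = {g: [0.0] * n_samples for g in genes}
    for j, col in enumerate(columns):
        # Rank genes within this sample (ties keep their input order).
        order = sorted(range(len(col)), key=col.__getitem__)
        for rank, i in enumerate(order):
            out[genes[i]][j] = reference[rank]
    return out

def log2_transform(matrix):
    """log2(x + 1); the +1 pseudocount is an assumption."""
    return {g: [math.log2(x + 1) for x in v] for g, v in matrix.items()}

def zscore_rows(matrix):
    """Z-score each gene across samples; constant genes become all zeros."""
    out = {}
    for g, v in matrix.items():
        mu, sd = statistics.mean(v), statistics.pstdev(v)
        out[g] = [(x - mu) / sd if sd > 0 else 0.0 for x in v]
    return out

# Hypothetical keyword list; the real pipeline's regex is not given in the text.
CONTROL_PATTERN = re.compile(
    r"(?<![A-Za-z])(dmso|control|ctrl|vehicle|untreated)(?![A-Za-z])",
    re.IGNORECASE,
)

def infer_group(sample_title):
    """Label a sample from its metadata title."""
    return "control" if CONTROL_PATTERN.search(sample_title) else "perturbation"

raw = {
    "TSPAN6": [0, 0, 0, 0],
    "TNMD": [0, 0, 0, 0],
    "DPM1": [1475, 1322, 1197, 1580],
    "SCYL3": [417, 291, 281, 357],
    "C1ORF112": [368, 304, 305, 524],
}
filtered = filter_low_counts(raw)
processed = zscore_rows(log2_transform(quantile_normalize(filtered)))
```

In this toy input, `TSPAN6` and `TNMD` are removed by the read-count filter, and a sample title such as `"AML_DMSO_rep1"` (a made-up title) would be labeled as a control.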
Dimensionality Reduction Visualization
Three dimensionality reduction techniques were applied to the processed expression matrices: UMAP [2], PCA [3], and t-SNE [4]. UMAP was computed with the UMAP Python package, and PCA and t-SNE were computed with the Scikit-Learn Python library. The samples were then plotted on 2D scatterplots.
Clustergram Heatmap
As a preliminary step, the 1000 most variable genes were selected. Clustergram heatmaps were then generated from this gene set. Two versions of the clustergram exist: an interactive one generated by Clustergrammer [13] and a publication-ready alternative.
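The gene selection step could look roughly like the sketch below. It assumes that variability is measured as the per-gene variance across samples and that this is computed before any per-gene z-scoring (after z-scoring, every gene would have the same variance). The function name and the toy values are hypothetical.

```python
import statistics

def top_variable_genes(matrix, n_top=1000):
    """Return the n_top gene names with the highest variance across samples."""
    variances = {g: statistics.pvariance(v) for g, v in matrix.items()}
    ranked = sorted(variances, key=variances.get, reverse=True)
    return ranked[:n_top]

# Toy log-scale expression values (invented for illustration).
expr = {
    "GENE_A": [1.0, 1.1, 0.9, 1.0],
    "GENE_B": [0.0, 2.0, -2.0, 0.0],
    "GENE_C": [0.5, -0.5, 0.5, -0.5],
}
selected = top_variable_genes(expr, n_top=2)
```

Here `GENE_B` and `GENE_C` are kept because they vary most across the four samples, while the nearly flat `GENE_A` is dropped.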
Differentially Expressed Genes Calculation and Volcano Plot
Differentially expressed genes between the control and perturbation samples were calculated using Limma Voom [14]. The logFC and -log10(p) values of each gene were visualized as a volcano scatterplot. Upregulated and downregulated genes were selected according to these criteria: p < 0.05 and |logFC| > 1.0.
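The selection criteria above can be illustrated with a short sketch. It assumes the threshold is applied to the unadjusted `P.Value` column, which the text does not specify. The function names and the two `HYPOTHETICAL_*` rows are invented; the `SRGN` and `TFRC` values come from Tables 2 and 4.

```python
import math

def classify_gene(log_fc, p_value, p_cutoff=0.05, lfc_cutoff=1.0):
    """Label a gene 'up', 'down', or 'ns' (not selected)."""
    if p_value < p_cutoff and log_fc > lfc_cutoff:
        return "up"
    if p_value < p_cutoff and log_fc < -lfc_cutoff:
        return "down"
    return "ns"

def volcano_point(log_fc, p_value):
    """Coordinates of a gene on the volcano plot: (logFC, -log10 p)."""
    return (log_fc, -math.log10(p_value))

rows = [
    ("SRGN", 2.14, 3.328260e-20),         # from Table 2
    ("TFRC", 0.99, 2.873655e-12),         # from Table 4
    ("HYPOTHETICAL_DOWN", -1.50, 1.0e-4),  # invented example
    ("HYPOTHETICAL_NS", 2.00, 0.20),       # invented example
]
labels = {gene: classify_gene(lfc, p) for gene, lfc, p in rows}
```

Note that `TFRC` is not selected despite its very small p-value, because its logFC of 0.99 falls just below the 1.0 cutoff.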
Enrichr Enrichment Analysis
The upregulated and downregulated sets were separately submitted to Enrichr [5, 6, 7]. These sets were compared against libraries from ChEA [8], ARCHS4 [11], Reactome Pathways [15], MGI Mammalian Phenotype [16], Gene Ontology Biological Processes [17], GWAS Catalog [18], KEGG [19, 20, 21], and WikiPathways [22]. The top matched terms from each library and their respective -log10(p) values were visualized as barplots.
ChEA3 Transcription Factor Analysis
The upregulated and downregulated sets were separately submitted to ChEA3 [8]. These sets were compared against the libraries ARCHS4 Coexpression [11], GTEx Coexpression [23], Enrichr [5, 6, 7], ENCODE ChIP-seq [24, 25], ReMap ChIP-seq [26], and Literature-mined ChIP-seq. The top matched TFs were ranked according to their average score across the libraries and represented as barplots.
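One way to picture the averaging across libraries is the sketch below, which assumes the per-library score is a rank (lower is better) and that a TF is averaged only over the libraries in which it appears. The function name, TF names, and ranks are hypothetical; this does not reproduce the ChEA3 service itself.

```python
from statistics import mean

def mean_rank(library_ranks):
    """Average each TF's rank over the libraries that contain it.

    library_ranks maps library name -> {tf: rank}. Returns a list of
    (tf, mean_rank) pairs sorted from best (lowest) to worst.
    """
    per_tf = {}
    for ranks in library_ranks.values():
        for tf, rank in ranks.items():
            per_tf.setdefault(tf, []).append(rank)
    scores = {tf: mean(r) for tf, r in per_tf.items()}
    return sorted(scores.items(), key=lambda item: item[1])

# Invented ranks for three placeholder TFs.
library_ranks = {
    "ARCHS4 Coexpression": {"TF_A": 1, "TF_B": 3, "TF_C": 2},
    "GTEx Coexpression": {"TF_A": 2, "TF_B": 1, "TF_C": 3},
    "ENCODE ChIP-seq": {"TF_A": 1, "TF_C": 2},
}
ranking = mean_rank(library_ranks)
```

In this toy input `TF_B` is absent from one library, so its average is taken over the two libraries that do rank it.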
L2S2 and DRUG-seqr Drug Analysis
The top 500 upregulated and top 500 downregulated genes were submitted simultaneously to identify reverser and mimicker molecules, both FDA-approved and non-FDA-approved, from the L2S2 [9] and DRUG-seqr [10] databases. The top matched molecules were compiled into tables and visualized as barplots.
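The idea behind the submitted gene sets and the overlap columns reported in the result tables can be sketched as follows. It assumes genes are ranked by logFC (the text does not state the ranking metric) and that a drug "up" signature mimics the query when it overlaps the query's upregulated genes and reverses it when it overlaps the downregulated genes. The function names and the `DOWN_*` and `OTHER` entries are invented; the three positive logFC values come from Table 2.

```python
def top_gene_sets(de_rows, n_top=500):
    """Split (gene, logFC) rows into the top up and top down gene lists."""
    ranked = sorted(de_rows, key=lambda row: row[1], reverse=True)
    up = [g for g, lfc in ranked[:n_top] if lfc > 0]
    down = [g for g, lfc in ranked[::-1][:n_top] if lfc < 0]
    return up, down

def overlaps(signature_genes, signature_direction, up, down):
    """Count how many signature genes fall in the query up/down sets."""
    sig = set(signature_genes)
    same, opposite = (up, down) if signature_direction == "up" else (down, up)
    return {
        "mimickerOverlap": len(sig & set(same)),
        "reverserOverlap": len(sig & set(opposite)),
    }

de_rows = [
    ("SRGN", 2.14), ("CHI3L1", 3.15), ("FTH1", 2.63),    # from Table 2
    ("DOWN_1", -1.2), ("DOWN_2", -2.5), ("DOWN_3", -0.4),  # invented
]
up, down = top_gene_sets(de_rows, n_top=2)
result = overlaps(["CHI3L1", "DOWN_2", "OTHER"], "up", up, down)
```

The actual services additionally attach a p-value and odds ratio to each overlap, which this sketch leaves out.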
| | GSM7884755 | GSM7884756 | GSM7884757 | GSM7884758 | GSM7884759 | GSM7884760 | GSM7884761 | GSM7884762 | GSM7884763 | GSM7884764 | GSM7884765 | GSM7884766 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| TSPAN6 | 0 | 0 | 0 | 0 | 0 | 2 | 1 | 0 | 0 | 0 | 0 | 2 |
| TNMD | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| DPM1 | 1475 | 1322 | 1197 | 1580 | 1236 | 1349 | 1090 | 899 | 1048 | 1126 | 1334 | 1148 |
| SCYL3 | 417 | 291 | 281 | 357 | 288 | 300 | 385 | 296 | 349 | 264 | 241 | 260 |
| C1ORF112 | 368 | 304 | 305 | 524 | 310 | 390 | 374 | 255 | 270 | 142 | 176 | 128 |
Table 1: This is a preview of the first 5 rows of the raw RNA-seq expression matrix from GSE247175.
Results
Dimensionality Reduction
UMAP
Figure 1: This figure displays a 2D scatter plot of a UMAP decomposition of the sample data. Each point represents an individual sample, colored by its experimental group.
PCA
Figure 2: This figure displays a 2D scatter plot of a PCA decomposition of the sample data. Each point represents an individual sample, colored by its experimental group.
t-SNE
Figure 3: This figure displays a 2D scatter plot using a t-SNE decomposition of the sample data. Each point represents an individual sample, colored by its experimental group.
Clustergram Heatmaps
Figure 4: The figure contains an interactive heatmap displaying gene expression for each sample in the RNA-seq dataset. Every row of the heatmap represents a gene, every column represents a sample, and every cell displays normalized gene expression values. The heatmap additionally features color bars beside each column which represent prior knowledge of each sample, such as the tissue of origin or experimental treatment.
Figure 5: This figure is a clustergram produced with the graphing library Plotly. It sacrifices some interactivity for a more polished look.
Differentially Expressed Genes Calculation and Volcano Plots
| gene_symbol | logFC | AveExpr | t | P.Value | adj.P.Val | B |
|---|---|---|---|---|---|---|
| SRGN | 2.14 | 9.83 | 105.66 | 3.328260e-20 | 6.334484e-16 | 36.70 |
| CHI3L1 | 3.15 | 9.74 | 101.23 | 5.753391e-20 | 6.334484e-16 | 36.07 |
| FTH1 | 2.63 | 10.86 | 76.01 | 2.238429e-18 | 1.504113e-14 | 32.71 |
| HSPA8 | 1.85 | 9.18 | 74.83 | 2.732266e-18 | 1.504113e-14 | 32.46 |
| FTL | 1.46 | 11.82 | 72.27 | 4.263324e-18 | 1.877568e-14 | 31.81 |
Table 2: This is a preview of the first 5 rows of the differentially expressed gene table calculated by Limma Voom.
| gene_symbol | logFC | AveExpr | t | P.Value | adj.P.Val | B |
|---|---|---|---|---|---|---|
| HSPA8 | 1.35 | 8.98 | 32.01 | 3.204121e-13 | 7.055474e-09 | 20.70 |
| SCD | 1.21 | 8.72 | 26.74 | 2.851548e-12 | 3.139554e-08 | 18.65 |
| FADS2 | 1.32 | 7.59 | 18.41 | 2.516819e-10 | 2.792420e-07 | 14.26 |
| SQLE | 1.05 | 6.75 | 18.39 | 2.553503e-10 | 2.792420e-07 | 14.22 |
| INSIG1 | 1.36 | 6.41 | 18.11 | 3.059481e-10 | 2.929121e-07 | 14.02 |
Table 3: This is a preview of the first 5 rows of the differentially expressed gene table calculated by Limma Voom.
| gene_symbol | logFC | AveExpr | t | P.Value | adj.P.Val | B |
|---|---|---|---|---|---|---|
| TFRC | 0.99 | 8.48 | 28.64 | 2.873655e-12 | 1.581947e-08 | 18.67 |
| TUBB4B | 0.93 | 7.84 | 24.34 | 1.898124e-11 | 4.179669e-08 | 16.81 |
| LPCAT1 | 0.88 | 7.85 | 24.04 | 2.193322e-11 | 4.390631e-08 | 16.67 |
| SFPQ | 0.62 | 9.47 | 20.80 | 1.164689e-10 | 1.349813e-07 | 15.02 |
| MT-CO1 | 0.41 | 13.45 | 20.34 | 1.509704e-10 | 1.662185e-07 | 14.07 |
Table 4: This is a preview of the first 5 rows of the differentially expressed gene table calculated by Limma Voom.
dmso-vs-combo
Figure 6: The figure contains an interactive scatter plot which displays the log2-fold changes and statistical significance of each gene calculated by performing a differential gene expression analysis for the comparison dmso-vs-combo. Every point in the plot represents a gene. Red points indicate significantly up-regulated genes, blue points indicate down-regulated genes.
dmso-vs-ha
Figure 7: The figure contains an interactive scatter plot which displays the log2-fold changes and statistical significance of each gene calculated by performing a differential gene expression analysis for the comparison dmso-vs-ha. Every point in the plot represents a gene. Red points indicate significantly up-regulated genes, blue points indicate down-regulated genes.
dmso-vs-ven
Figure 8: The figure contains an interactive scatter plot which displays the log2-fold changes and statistical significance of each gene calculated by performing a differential gene expression analysis for the comparison dmso-vs-ven. Every point in the plot represents a gene. Red points indicate significantly up-regulated genes, blue points indicate down-regulated genes.
Enrichr: Enrichment Analysis
Upregulated Set
dmso-vs-combo
Figure 9: This figure contains several barplots depicting enrichment analysis results on the upregulated gene set. Each barplot corresponds to an individual library from Enrichr, and the top matching terms by p-value are depicted in each. Statistically significant terms are represented as red bars while others are represented as gray. Access your Enrichment results here: https://amp.pharm.mssm.edu/Enrichr/enrich?dataset=f46b595ce1c8abb5b5f47c6e078592d4
dmso-vs-ha
Figure 10: This figure contains several barplots depicting enrichment analysis results on the upregulated gene set. Each barplot corresponds to an individual library from Enrichr, and the top matching terms by p-value are depicted in each. Statistically significant terms are represented as red bars while others are represented as gray. Access your Enrichment results here: https://amp.pharm.mssm.edu/Enrichr/enrich?dataset=9e29b635bff0f028bf85ef8e41c65a83
dmso-vs-ven
Figure 11: This figure contains several barplots depicting enrichment analysis results on the upregulated gene set. Each barplot corresponds to an individual library from Enrichr, and the top matching terms by p-value are depicted in each. Statistically significant terms are represented as red bars while others are represented as gray. Access your Enrichment results here: https://amp.pharm.mssm.edu/Enrichr/enrich?dataset=372778e4f84d2a69f1b68d91c1f925a1
Downregulated Set
dmso-vs-combo
Figure 12: This figure contains several barplots depicting enrichment analysis results on the downregulated gene set. Each barplot corresponds to an individual library from Enrichr, and the top matching terms by p-value are depicted in each. Statistically significant terms are represented as red bars while others are represented as gray. Access your Enrichment results here: https://amp.pharm.mssm.edu/Enrichr/enrich?dataset=372778e4f84d2a69f1b68d91c1f925a1
dmso-vs-ha
Figure 13: This figure contains several barplots depicting enrichment analysis results on the downregulated gene set. Each barplot corresponds to an individual library from Enrichr, and the top matching terms by p-value are depicted in each. Statistically significant terms are represented as red bars while others are represented as gray. Access your Enrichment results here: https://amp.pharm.mssm.edu/Enrichr/enrich?dataset=372778e4f84d2a69f1b68d91c1f925a1
dmso-vs-ven
Figure 14: This figure contains several barplots depicting enrichment analysis results on the downregulated gene set. Each barplot corresponds to an individual library from Enrichr, and the top matching terms by p-value are depicted in each. Statistically significant terms are represented as red bars while others are represented as gray. Access your Enrichment results here: https://amp.pharm.mssm.edu/Enrichr/enrich?dataset=372778e4f84d2a69f1b68d91c1f925a1
ChEA3: Transcription Factor Enrichment Analysis
Upregulated Set
dmso-vs-combo
Figure 15: Horizontal bar chart, y-axis represents transcription factors. Displays the top ranked transcription factors for the upregulated set according to their average integrated scores across all the libraries.
dmso-vs-ha
Figure 16: Horizontal bar chart, y-axis represents transcription factors. Displays the top ranked transcription factors for the upregulated set according to their average integrated scores across all the libraries.
dmso-vs-ven
Figure 17: Horizontal bar chart, y-axis represents transcription factors. Displays the top ranked transcription factors for the upregulated set according to their average integrated scores across all the libraries.
Downregulated Set
dmso-vs-combo
Figure 18: Horizontal bar chart, y-axis represents transcription factors. Displays the top ranked transcription factors for the downregulated set according to their average integrated scores across all the libraries.
dmso-vs-ha
Figure 19: Horizontal bar chart, y-axis represents transcription factors. Displays the top ranked transcription factors for the downregulated set according to their average integrated scores across all the libraries.
dmso-vs-ven
Figure 20: Horizontal bar chart, y-axis represents transcription factors. Displays the top ranked transcription factors for the downregulated set according to their average integrated scores across all the libraries.
L2S2 and DRUG-seqr: Reverser and Mimicker Drugs
Reverser Results
dmso-vs-combo
l2s2_fda
Caught error: No Results for l2s2_fda
l2s2_all
| | perturbation | term | pvalueReverse | adjPvalueReverse | oddsRatioReverse | reverserOverlap | approved | count |
|---|---|---|---|---|---|---|---|---|
| 0 | SL-0101-1 | AICHI002_K562_4H_D11_SL-0101-1_0.04uM up | 8.99e-01 | 1 | 7.01e-01 | 8 | False | 192 |
| 1 | GW-5074 | AICHI001_THP1_4H_N14_GW-5074_2.5uM up | 1.00e+00 | 1 | 0.00e+00 | 0 | False | 712 |
Table 5: Ranked LINCS L1000 signatures predicted to reverse the uploaded geneset.
Figure 21: Barplot depicting the -log10(p) values of the top l2s2_all reversers. Red bars represent statistically significant results; all other bars are gray.
drugseqr_fda
Caught error: No Results for drugseqr_fda
drugseqr_all
Caught error: No Results for drugseqr_all
dmso-vs-ha
l2s2_fda
Caught error: No Results for l2s2_fda
l2s2_all
| | perturbation | term | pvalueReverse | adjPvalueReverse | oddsRatioReverse | reverserOverlap | approved | count |
|---|---|---|---|---|---|---|---|---|
| 0 | SL-0101-1 | AICHI002_K562_4H_D11_SL-0101-1_0.04uM up | 8.99e-01 | 1 | 7.01e-01 | 8 | False | 192 |
| 1 | GW-5074 | AICHI001_THP1_4H_N14_GW-5074_2.5uM up | 1.00e+00 | 1 | 0.00e+00 | 0 | False | 712 |
Table 6: Ranked LINCS L1000 signatures predicted to reverse the uploaded geneset.
Figure 22: Barplot depicting the -log10(p) values of the top l2s2_all reversers. Red bars represent statistically significant results; all other bars are gray.
drugseqr_fda
Caught error: No Results for drugseqr_fda
drugseqr_all
Caught error: No Results for drugseqr_all
dmso-vs-ven
l2s2_fda
Caught error: No Results for l2s2_fda
l2s2_all
| | perturbation | term | pvalueReverse | adjPvalueReverse | oddsRatioReverse | reverserOverlap | approved | count |
|---|---|---|---|---|---|---|---|---|
| 0 | SL-0101-1 | AICHI002_K562_4H_D11_SL-0101-1_0.04uM up | 8.99e-01 | 1 | 7.01e-01 | 8 | False | 192 |
| 1 | GW-5074 | AICHI001_THP1_4H_N14_GW-5074_2.5uM up | 1.00e+00 | 1 | 0.00e+00 | 0 | False | 712 |
Table 7: Ranked LINCS L1000 signatures predicted to reverse the uploaded geneset.
Figure 23: Barplot depicting the -log10(p) values of the top l2s2_all reversers. Red bars represent statistically significant results; all other bars are gray.
drugseqr_fda
Caught error: No Results for drugseqr_fda
drugseqr_all
Caught error: No Results for drugseqr_all
Mimicker Results
dmso-vs-combo
l2s2_fda
Caught error: No Results for l2s2_fda
l2s2_all
| | perturbation | term | pvalueMimic | adjPvalueMimic | oddsRatioMimic | mimickerOverlap | approved | count |
|---|---|---|---|---|---|---|---|---|
| 0 | SL-0101-1 | AICHI002_K562_4H_D11_SL-0101-1_0.04uM up | 3.77e-09 | 6.33e-03 | 3.24e+00 | 35 | False | 192 |
| 1 | GW-5074 | AICHI001_THP1_4H_N14_GW-5074_2.5uM up | 1.34e-08 | 1.13e-02 | 3.14e+00 | 34 | False | 712 |
Table 8: Ranked LINCS L1000 signatures predicted to mimic the uploaded geneset.
Figure 24: Barplot depicting the -log10(p) values of the top l2s2_all mimickers. Red bars represent statistically significant results; all other bars are gray.
drugseqr_fda
Caught error: No Results for drugseqr_fda
drugseqr_all
Caught error: No Results for drugseqr_all
dmso-vs-ha
l2s2_fda
Caught error: No Results for l2s2_fda
l2s2_all
| | perturbation | term | pvalueMimic | adjPvalueMimic | oddsRatioMimic | mimickerOverlap | approved | count |
|---|---|---|---|---|---|---|---|---|
| 0 | SL-0101-1 | AICHI002_K562_4H_D11_SL-0101-1_0.04uM up | 3.77e-09 | 6.33e-03 | 3.24e+00 | 35 | False | 192 |
| 1 | GW-5074 | AICHI001_THP1_4H_N14_GW-5074_2.5uM up | 1.34e-08 | 1.13e-02 | 3.14e+00 | 34 | False | 712 |
Table 9: Ranked LINCS L1000 signatures predicted to mimic the uploaded geneset.
Figure 25: Barplot depicting the -log10(p) values of the top l2s2_all mimickers. Red bars represent statistically significant results; all other bars are gray.
drugseqr_fda
Caught error: No Results for drugseqr_fda
drugseqr_all
Caught error: No Results for drugseqr_all
dmso-vs-ven
l2s2_fda
Caught error: No Results for l2s2_fda
l2s2_all
| | perturbation | term | pvalueMimic | adjPvalueMimic | oddsRatioMimic | mimickerOverlap | approved | count |
|---|---|---|---|---|---|---|---|---|
| 0 | SL-0101-1 | AICHI002_K562_4H_D11_SL-0101-1_0.04uM up | 3.77e-09 | 6.33e-03 | 3.24e+00 | 35 | False | 192 |
| 1 | GW-5074 | AICHI001_THP1_4H_N14_GW-5074_2.5uM up | 1.34e-08 | 1.13e-02 | 3.14e+00 | 34 | False | 712 |
Table 10: Ranked LINCS L1000 signatures predicted to mimic the uploaded geneset.
Figure 26: Barplot depicting the -log10(p) values of the top l2s2_all mimickers. Red bars represent statistically significant results; all other bars are gray.
drugseqr_fda
Caught error: No Results for drugseqr_fda
drugseqr_all
Caught error: No Results for drugseqr_all
References
[1] Jiang, X., Huang, K., Sun, X., Li, Y., Hua, L., Liu, F., Huang, R., Du, J., & Zeng, H. (2024). Hexamethylene amiloride synergizes with venetoclax to induce lysosome-dependent cell death in acute myeloid leukemia. iScience, 27(1), 108691.
[2] McInnes L, Healy J, Saul N, Großberger L. UMAP: Uniform manifold approximation and projection. Journal of Open Source Software. 2018;3(29):861. doi:10.21105/joss.00861
[3] Clark NR, Ma’ayan A. Introduction to statistical methods to analyze large data sets: Principal Components Analysis. Science Signaling. 2011;4(190):tr3-tr3. doi:10.1126/scisignal.2001967
[4] van der Maaten L, Hinton G. Visualizing Data using t-SNE. Journal of Machine Learning Research. 2008;9(86):2579-2605.
[5] Chen EY, Tan CM, Kou Y, Duan Q, Wang Z, Meirelles GV, Clark NR, Ma'ayan A. Enrichr: interactive and collaborative HTML5 gene list enrichment analysis tool. BMC Bioinformatics. 2013;14:128.
[6] Kuleshov MV, Jones MR, Rouillard AD, Fernandez NF, Duan Q, Wang Z, Koplev S, Jenkins SL, Jagodnik KM, Lachmann A, McDermott MG, Monteiro CD, Gundersen GW, Ma'ayan A. Enrichr: a comprehensive gene set enrichment analysis web server 2016 update. Nucleic Acids Research. 2016; gkw377.
[7] Xie Z, Bailey A, Kuleshov MV, Clarke DJB, Evangelista JE, Jenkins SL, Lachmann A, Wojciechowicz ML, Kropiwnicki E, Jagodnik KM, Jeon M, Ma’ayan A. Gene set knowledge discovery with Enrichr. Current Protocols. 2021;1:e90. doi: 10.1002/cpz1.90
[8] Keenan AB, Torre D, Lachmann A, Leong AK, Wojciechowicz M, Utti V, Jagodnik K, Kropiwnicki E, Wang Z, Ma'ayan A (2019) ChEA3: transcription factor enrichment analysis by orthogonal omics integration. Nucleic Acids Research. doi: 10.1093/nar/gkz446
[9] Marino GB, Evangelista JE, Clarke DJB, Ma’ayan A. L2S2: chemical perturbation and CRISPR KO LINCS L1000 signature search engine. Nucleic Acids Res. 2025; gkaf373. doi:10.1093/nar/gkaf373
[10] Li J, Ho DJ, Henault M, et al. DRUG-seq Provides Unbiased Biological Activity Readouts for Neuroscience Drug Discovery. ACS Chem Biol. 2022;17(6):1401-1414. doi:10.1021/acschembio.1c00920
[11] Lachmann A, Torre D, Keenan AB, Jagodnik KM, Lee HJ, Wang L, Silverstein MC, Ma'ayan A. Massive mining of publicly available RNA-seq data from human and mouse. Nature Communications 9. Article number: 1366 (2018), doi: 10.1038/s41467-018-03751-6.
[12] Bray, N., Pimentel, H., Melsted, P. et al. Near-optimal probabilistic RNA-seq quantification. Nat Biotechnol 34, 525–527 (2016). https://doi.org/10.1038/nbt.3519
[13] Fernandez, N. F. et al. Clustergrammer, a web-based heatmap visualization and analysis tool for high-dimensional biological data. Sci. Data 4:170151 doi: 10.1038/sdata.2017.151 (2017).
[14] Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, Smyth GK. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015 Apr 20;43(7):e47. doi: 10.1093/nar/gkv007.
[15] Milacic M, Beavers D, Conley P, Gong C, Gillespie M, Griss J, Haw R, Jassal B, Matthews L, May B, Petryszak R, Ragueneau E, Rothfels K, Sevilla C, Shamovsky V, Stephan R, Tiwari K, Varusai T, Weiser J, Wright A, Wu G, Stein L, Hermjakob H, D’Eustachio P. The Reactome Pathway Knowledgebase 2024. Nucleic Acids Research. 2024. doi: 10.1093/nar/gkad1025.
[16] Eppig JT, Smith CL, Blake JA, Ringwald M, Kadin JA, Richardson JE, Bult CJ. Mouse Genome Informatics (MGI): Resources for Mining Mouse Genetic, Genomic, and Biological Data in Support of Primary and Translational Research. Methods Mol Biol. 2017;1488:47-73. doi: 10.1007/978-1-4939-6427-7_3.
[17] Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, Harris MA, Hill DP, Issel-Tarver L, Kasarskis A, Lewis S, Matese JC, Richardson JE, Ringwald M, Rubin GM, Sherlock G. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet. 2000 May;25(1):25-9. doi: 10.1038/75556.
[18] Cerezo M, Sollis E, Ji Y, et al. The NHGRI-EBI GWAS Catalog: standards for reusability, sustainability and diversity. Nucleic Acids Res. 2025;53(D1):D998-D1005. doi:10.1093/nar/gkae1070
[19] Kanehisa M, Furumichi M, Sato Y, Matsuura Y, Ishiguro-Watanabe M. KEGG: biological systems database as a model of the real world. Nucleic Acids Res. 2025;53(D1):D672-D677. doi:10.1093/nar/gkae909
[20] Kanehisa M, Goto S. KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 2000;28(1):27-30. doi:10.1093/nar/28.1.27
[21] Kanehisa M. Toward understanding the origin and evolution of cellular organisms. Protein Sci. 2019;28(11):1947-1951. doi:10.1002/pro.3715
[22] Pico AR, Kelder T, van Iersel MP, Hanspers K, Conklin BR, Evelo C. WikiPathways: pathway editing for the people. PLoS Biol. 2008 Jul 22;6(7):e184. doi: 10.1371/journal.pbio.0060184.
[23] GTEx Consortium. The Genotype-Tissue Expression (GTEx) project. Nat Genet. 2013 Jun;45(6):580-5. doi: 10.1038/ng.2653.
[24] ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012;489(7414):57-74. doi:10.1038/nature11247
[25] Luo Y, Hitz BC, Gabdank I, et al. New developments on the Encyclopedia of DNA Elements (ENCODE) data portal. Nucleic Acids Res. 2020;48(D1):D882-D889. doi:10.1093/nar/gkz1062
[26] Hammal F, de Langen P, Bergon A, Lopez F, Ballester B. ReMap 2022: a database of Human, Mouse, Drosophila and Arabidopsis regulatory regions from an integrative analysis of DNA-binding sequencing experiments. Nucleic Acids Res. 2022;50(D1):D316-D325. doi:10.1093/nar/gkab996